Frequent Frames and Grammatical Categorization Categorizing Words Using "Frequent Frames": What Cross-Linguistic Analyses Reveal About Distributional Acquisition Strategies
Authors
Abstract
Mintz (2003) described a distributional environment called a frame, defined as the co-occurrence of two context words with one intervening target word. Analyses of English child-directed speech showed that words that fell within any frequently occurring frame consistently belonged to the same grammatical category (e.g., noun, verb, adjective). In this paper, we first generalize this result to French, whose function word system allows patterns that are potentially detrimental to a frame-based analysis procedure. Second, we show that the discontinuity of the chosen environments (i.e., the fact that target words are framed by the context words) is crucial for the mechanism to be efficient. This property might be relevant for any computational approach to grammatical categorization. Finally, we investigated a recursive application of the procedure and observed that the categorization is paradoxically worse when context elements are categories rather than actual lexical items. Item-specificity is thus also a core computational principle for this type of algorithm. Our analysis, along with results from behavioral studies (Gómez, 2002; Gómez and Maye, 2005; Mintz, 2006), provides strong support for frames as a basis for the acquisition of grammatical categories by infants. Discontinuity and item-specificity appeared to be crucial features.
Grammatical categories such as noun, verb, and adjective are the building blocks of linguistic structure. Identifying the categories of words allows infants and young children to learn about the syntactic properties of their language. Thus, understanding how infants and young children learn the categories of words in their language is crucial for any theory of language acquisition.
In addition, knowledge of word categories and the syntactic structures in which they participate may aid learners in acquiring word meaning (Gleitman, 1990; Gleitman, Cassidy, Nappa, Papafragou and Trueswell, 2005; Landau and Gleitman, 1985).
In their introductory text on syntactic theory, Koopman, Sportiche and Stabler (2003) describe the main concepts that allow linguists to posit syntactic categories: “a category is a set of expressions that all ‘behave the same way’ in language. And the fundamental evidence for claims about how a word behaves is the distribution of words in the language: where can they appear, and where would they produce nonsense, or some other kind of deviance.” These observations are at the core of the notions behind structural linguistics in the early 20th century (Bloomfield, 1933; Harris, 1951), namely, that form-class categories were defined by co-occurrence privileges. Maratsos and Chalkley (1980) advanced the proposal that children may use distributional information of this type as a primary basis for categorizing words. In the past decade, a number of studies have investigated how useful purely distributional information might be to young children in initially forming categories of words (Cartwright & Brent, 1997; Mintz, 2003; Mintz, Newport, & Bever, 2002; Redington, Chater, & Finch, 1998). Employing a variety of categorization procedures, these investigations demonstrated that lexical co-occurrence patterns in child-directed speech could provide a robust source of information for children to correctly categorize nouns and verbs, and to some degree other form-class categories as well.
One challenge in forming categories from distributional cues is to establish an efficient balance between the detection of the especially informative contexts and the rejection of the potentially misleading ones. For example, in (1), the fact that cat and mat both occur after the suggests that the two words belong to the same category.
However, applying this very same reasoning to example (2) would lead one to conclude that large and mat belong to the same category (see Pinker, 1987, for related arguments).
(1) the cat is on the mat
(2) the large cat is on the mat
To address the problem of the variability of informative distributional contexts, the procedures developed by Redington et al. (1998) and Mintz et al. (2002) took into account the entire range of contexts a word occurred in, and essentially classified words based on their distributional profiles across entire corpora. While in (1) and (2), the adjective large shares a preceding context with cat and mat, in other utterances it occurs in environments that would not be shared with nouns, as in (3). Many misclassifications that would occur if only individual occurrences of a target word were considered turned out not to result when taking into account the statistical information about the frequency of a target word occurring across different contexts.
Footnote 1: Mintz et al. and Redington et al. also incorporated more distributional positions into their analysis than just the immediately preceding word, e.g., the following word, words that were two positions before or after, etc. However, the addition of contexts does not, a priori, make the potential for misclassifications go away.
(3) the cat on the mat is large
Mintz (2003) took a different approach. Rather than starting with target words and tallying the entire range of contexts in which they occur, the basis for his categorization is a particular type of context, which he called frequent frames, defined as two words that frequently co-occur in a corpus with exactly one word intervening. (Schematically, we indicate a frame as [A x B] with A and B referring to the co-occurring words and x representing the position of the target words.)
For example, in (3), [the x on] is a frame that contains the word cat; it so happens that in the English child-directed corpora investigated by Mintz (2003), this frame contained exclusively nouns, leading to a virtually error-free grouping together of nouns. Examining many frames in child-directed speech, Mintz demonstrated that in English, frames that occur frequently contain intervening words that almost exclusively belong to the same grammatical category. He proposed that frequent frames could be the basis for children’s initial lexical categories.
One critical aspect of frequent frames is that the framing words (e.g., the and on in the example above) must frequently co-occur. Arguably, co-occurrences that are frequent are not accidental (as infrequent co-occurrences might be), but rather arise from some kind of constraint in the language. In particular, structural constraints governed by the grammar could give rise to this kind of co-occurrence regularity. It is not surprising, then, that the words categorized by a given frequent frame play a similar structural role in the grammar, i.e., they belong to the same category.
Thus, in the frequent frames approach, the important computational work involves identifying the frequent frames. Once identified, categorization is simply a matter of grouping together the words that intervene in a given frequent frame throughout a corpus. In contrast, in other approaches (Mintz et al., 2002; Redington et al., 1998) the crucial computations involved tracking the statistical profile of each of the most frequent words with respect to all the contexts in which it occurs, and comparing the profiles of each word with all the other words. Thus, an advantage of the frequent frames categorization process is that, once a set of frequent frames has been identified, a single occurrence of an uncategorized word in a frequent frame would be sufficient for categorization.
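The two steps just described (identify the frequent frames, then group the words that intervene in each one) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure as implemented by Mintz (2003): the function name `frequent_frame_categories`, the `min_count` cutoff, and the toy utterances are all assumptions introduced here, and the text above does not specify how "frequent" is operationalized.

```python
from collections import Counter, defaultdict

def frequent_frame_categories(utterances, min_count=2):
    """Group target words by the [A x B] frames that occur at least min_count times.

    `utterances` is a list of token lists. `min_count` is a hypothetical
    frequency cutoff chosen for this sketch, not a value from the paper.
    """
    frame_counts = Counter()
    frame_members = defaultdict(set)
    for tokens in utterances:
        # Slide a three-word window; frames never cross utterance boundaries.
        for a, x, b in zip(tokens, tokens[1:], tokens[2:]):
            frame_counts[(a, b)] += 1
            frame_members[(a, b)].add(x)
    # Keep only the frames that pass the frequency cutoff.
    return {
        frame: frame_members[frame]
        for frame, count in frame_counts.items()
        if count >= min_count
    }

# Toy utterances invented for illustration (not taken from any corpus).
toy_corpus = [
    "the cat on the mat is large".split(),
    "the dog on the mat is small".split(),
    "the ball on the floor is red".split(),
    "you see the cat".split(),
    "you want the ball".split(),
]

categories = frequent_frame_categories(toy_corpus, min_count=2)
for (a, b), words in categories.items():
    print(f"[{a} x {b}]: {sorted(words)}")
```

In this toy input the frame [the x on] collects cat, dog, and ball, while [you x the] collects see and want. A new word would be assigned to a category as soon as it appears once inside an already identified frame, which is the single-occurrence property noted above.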
Moreover, it is computationally simpler, in that fewer total contexts are involved in analyzing a corpus.
In addition to research showing the informativeness and computational efficiency of frequent frames (in English), several behavioral studies suggest that infants attend to frame-like patterns and may use them to categorize novel words. For example, Gómez (2002) showed that sufficient variability in intervening items allowed 18-month-old infants to detect frame-like discontinuous regularities, and Gómez and Maye (2005) showed that this ability was already detectable in 15-month-olds. This suggests that the resources required to detect frequent frames are within the ability of young infants. Second, Mintz (2006) showed that English-learning 12-month-olds categorize together novel words when they occur within actual frequent frames (e.g., infants categorized bist and lonk together when they heard both words used in the [you X the] frequent frame).
Although frequent frames have been shown to be a simple yet robust source of lexical category information, the analyses have been limited to English. One goal of the present paper is to start to test the validity of frequent frames cross-linguistically. To this end, in Experiment 1, we test the validity of frequent frames in French, a language which presents several potentially problematic features for the frame-based procedure. An additional goal was to characterize the core computational principles that make frequent frames such robust environments for categorization. To this end, in Experiment 2 in both French and English, we compare frames with other types of contexts that are at first sight very similar to frames in terms of their intrinsic informational content and structure: [A B x] and [x A B]. Interestingly, despite the similarity of these contexts to frames, they yielded much poorer categorization.
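The three context types compared in Experiment 2 differ only in where the target word sits relative to the two context words. The sketch below makes that difference concrete. The function `group_by_context`, the string labels for the context types, and the two toy utterances (adapted from example (1)) are assumptions made here for illustration; the output says nothing about the corpus results reported in the paper.

```python
from collections import defaultdict

def group_by_context(utterances, context_type):
    """Map each two-word context to the set of target words it selects.

    context_type is "AxB" (frame), "ABx" (two preceding words),
    or "xAB" (two following words). The labels are hypothetical.
    """
    groups = defaultdict(set)
    for tokens in utterances:
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            if context_type == "AxB":
                context, target = (w1, w3), w2
            elif context_type == "ABx":
                context, target = (w1, w2), w3
            elif context_type == "xAB":
                context, target = (w2, w3), w1
            else:
                raise ValueError(f"unknown context type: {context_type}")
            groups[context].add(target)
    return dict(groups)

# Toy utterances invented for illustration.
toy_corpus = [
    "the cat is on the mat".split(),
    "the cat is on the large mat".split(),
]

frames = group_by_context(toy_corpus, "AxB")
preceding = group_by_context(toy_corpus, "ABx")

print(sorted(preceding[("on", "the")]))  # noun and adjective share [on the x]
print(sorted(frames[("the", "mat")]))    # the frame [the x mat] holds only the adjective
```

In this toy input the [A B x] context [on the x] groups the noun mat with the adjective large, whereas no frame in the same input mixes the two. This only illustrates the kind of confusion a non-framing context permits; it is not evidence for the size of the effect.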
The results of this experiment suggest that co-occurring context elements must frame a target word. Finally, in Experiment 3 we investigated the consequences of a recursive application of this frame-based procedure, again with French and English corpora. Specifically, we performed an initial analysis to derive frame-based categories, then reanalyzed the corpus defining frames based on the categories of words derived in the initial analysis. A somewhat counterintuitive finding was that the recursive application of the frame-based procedure resulted in relatively poor categorization. This finding suggests that basing computations on specific items (words), as opposed to categories, is a core principle in categorizing words, at least initially.
Experiment 1: French Frequent Frames
This first experiment investigates the viability of the frequent frames proposal for French. Several features of the language suggest that frequent frames may be less efficient in French than in English. For example, English frequent frames heavily relied on closed-class words, such as determiners, pronouns, and prepositions. In French, there is homophony between clitic object pronouns and determiners, le/la/les, which could potentially give rise to erroneous generalizations. For instance, la in ‘la pomme’ (the apple) is an article and precedes a noun, whereas la in ‘je la mange’ (I eat it) is a clitic object pronoun and precedes a verb. There are also a greater number of determiners, which could result in less comprehensive categories. For instance, French has three different definite determiners, le/la/les, varying in gender and number, that all translate into the in English. Finally, constructions involving object clitics in French exclude many robust English frame environments: for example, [I x it], a powerful verb-detecting frame in English, translates into [je le/la x] in French, which is not a frame. Do French frequent frames nevertheless provide robust category information, as in English?
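Returning to the recursive procedure outlined for Experiment 3, its mechanics can be sketched as two passes over the same corpus. This is a loose sketch under several assumptions that the text above does not confirm: the names `frame_groups` and `recursive_frame_groups`, the `min_count` cutoff, the `CAT0`, `CAT1`, ... labels, and the rule that a word takes the label of the first frame containing it are all hypothetical choices made here.

```python
from collections import Counter, defaultdict

def frame_groups(utterances, min_count=2, label=None):
    """Collect [A x B] groups; context words are relabeled via `label` if given."""
    label = label or {}
    counts = Counter()
    members = defaultdict(set)
    for tokens in utterances:
        for a, x, b in zip(tokens, tokens[1:], tokens[2:]):
            # Context elements are categories where known, else the words themselves.
            frame = (label.get(a, a), label.get(b, b))
            counts[frame] += 1
            members[frame].add(x)
    return {f: members[f] for f, n in counts.items() if n >= min_count}

def recursive_frame_groups(utterances, min_count=2):
    """First pass on lexical items, second pass on the derived category labels."""
    first_pass = frame_groups(utterances, min_count)
    label = {}
    for i, words in enumerate(first_pass.values()):
        for word in words:
            # Assumption: a word keeps the label of the first frame that holds it.
            label.setdefault(word, f"CAT{i}")
    return frame_groups(utterances, min_count, label)

# Toy utterances invented for illustration.
toy_corpus = [
    "the cat on the mat is large".split(),
    "the dog on the mat is small".split(),
    "the ball on the floor is red".split(),
    "you see the cat".split(),
    "you want the ball".split(),
]

second_pass = recursive_frame_groups(toy_corpus)
for (a, b), words in second_pass.items():
    print(f"[{a} x {b}]: {sorted(words)}")
```

A corpus this small cannot show the loss of accuracy reported for category-based frames. The sketch only shows what changes between passes: the context positions hold category labels instead of specific words, so a single category-based frame pools the contexts of many lexical frames.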
Similar resources
Categorizing words using 'frequent frames': what cross-linguistic analyses reveal about distributional acquisition strategies.
Mintz (2003) described a distributional environment called a frame, defined as the co-occurrence of two context words with one intervening target word. Analyses of English child-directed speech showed that words that fell within any frequently occurring frame consistently belonged to the same grammatical category (e.g. noun, verb, adjective, etc.). In this paper, we first generalize this result...
Frequent Frames As A Cue For Grammatical Categories In Child Directed Speech
This paper introduces the notion of frequent frames, distributional patterns based on cooccurrence patterns of words in sentences, then investigates the usefulness of this information in grammatical categorization. A frame is defined as two jointly occurring words with one word intervening. Qualitative and quantitative results from distributional analyses of six different corpora of child direc...
Frequent frames as a cue for grammatical categories in child directed speech.
This paper introduces the notion of frequent frames, distributional patterns based on co-occurrence patterns of words in sentences, then investigates the usefulness of this information in grammatical categorization. A frame is defined as two jointly occurring words with one word intervening. Qualitative and quantitative results from distributional analyses of six different corpora of child dire...
What’s in the input
Recent analyses have revealed that child-directed speech contains distributional regularities that could, in principle, support young children’s discovery of distinct grammatical categories (noun, verb, adjective). In particular, a distributional unit known as the frequent frame appears to be especially informative (Mintz, 2003). However, analyses have focused almost exclusively on the distribu...
A universal cue for grammatical categories in the input to children: Frequent frames
How does a child map words to grammatical categories when words are not overtly marked either lexically or prosodically? Recent language acquisition theories have proposed that distributional information encoded in sequences of words or morphemes might play a central role in forming grammatical classes. To test this proposal, we analyze child-directed speech from seven typologically diverse lan...
Publication date: 2007